1. Introduction

Information geometry is the study of the natural differential structures which arise on the
space of families of probability density functions. The Fisher information metric defines
a notion of the distance between two particular members of a family of probability density
functions and is the natural measure arising out of the small change expansion of the
Kullback-Leibler divergence [1]. The existence of such a distance measure is of obvious utility
for answering questions related to, for example, the mutual information of two systems
described by different probability density functions, the likely error made in approximating
one distribution by another, and even a definition of a gradient descent algorithm consistent
with the differential geometric structure of a probability space [2].

The study of information geometry was first expounded upon in detail by Shun'ichi Amari
and the foundations were laid out in [3]. A great deal is now known about the geometric 
properties of information manifolds. In particular, given a family of probability density 
functions, the associated Fisher information metric may be stated as a concrete integral (or 
sum in the case of discrete variables). However, comparatively little is known about the 
‘reverse’ operation. That is, given a Riemannian metric tensor, what can be said about 
the family of probability density functions which are naturally endowed with such a metric 
tensor? In this short note we show how one can, in theory, perform this inverse process and 
observe that it is far from one-to-one.

Our interest in the subject is not from the point of view of machine learning or information
theory as such. In recent years, a new link has surfaced between information geometry and
the study of space-time as an emergent phenomenon. Within string theory there has been
much work over the last 15 years in the study of how the dynamics of interacting gauge
theories in the limit of a large number of gauge degrees of freedom can give rise to emergent
spacetimes of a variety of geometries. The most natural such structure arises out of a
scale-free gauge theory providing, holographically, an anti-de Sitter space [4], the so-called
AdS/CFT correspondence. Coincidentally, the Euclidean version of anti-de Sitter space (a
hyperbolic geometry) is a geometry which emerges frequently from a large class of different
probability density functions. Indeed, in the construction used by Hitchin [5], such a space
arises naturally out of symmetry arguments when the Fisher information metric tensor is
computed from the instanton moduli space in such gauge theories. In [6] these two ideas
were tied together, showing how information geometry seemed to give a natural means for
calculating emergent geometries in an AdS/CFT context. Interesting relationships between
information geometry, quantum information and string theory/holography have also been
studied in [7], [8], [9] and [10].

In what follows, we explore in more detail the link between information and geometry. 

2. The Fisher information metric 

2.1. Families of probability density functions and their associated 
geometries 

For the purposes of this work, we will assume a narrow definition of a family of probability
density functions. That is, when we write ‘family of probability density functions’ we will
mean a family of continuous functions $P_\theta : X \to \mathbb{R}$ for some domain $X \subseteq \mathbb{R}^n$, parameterised
over $\theta \in M \subseteq \mathbb{R}^m$ (i.e. an $m$-parameter family of distributions). Coordinatizing $X$ by
$x = (x^1, \ldots, x^n)$ and the parameter space $M$ by $\theta = (\theta^1, \ldots, \theta^m)$, we will also further require
that $\partial_a P_\theta := \partial P_\theta / \partial \theta^a$ is continuous on $X$ for all $\theta \in M$. Furthermore, we will also require that
every member of the family be normalised, that is,

$$\int_X P_\theta(x)\, dx = 1.$$



All of this may be succinctly restated as $\{P_\theta\}$ being a parametrised family of normalised,
continuous functions which changes ‘smoothly’ over parameter space. Finally, we will refer
to $X$ as the spatial domain and $M$ as the parametric domain, and conventionally associate
the spatial domain $X_i$ to probability density function $P_i$.

We now define the Fisher information metric tensor on a finite dimensional statistical
manifold. Given such a manifold, $M$, whose points form a family of probability density
functions with the properties listed above, there exists a Riemannian metric tensor on $M$,
viz.,

$$g_{ab}(\theta) = \int_X P(x;\theta)\, \partial_a \ln P(x;\theta)\, \partial_b \ln P(x;\theta)\, dx. \tag{2.1.1}$$

The central question addressed in this paper may thus be stated as: given a Riemannian
metric tensor $g$, under what circumstances can a family of probability density functions $P$
be found such that the Fisher information metric tensor of $P$ is $g$?

2.2. Some examples 

In order to build some intuition for the relationship between a family of probability density 
functions and their associated metrics, we give here two examples of the computation of the 
Fisher metric. 

2.2.1. Univariate Normal Distribution 

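Here the family of probability density functions is given by

$$P(x;\mu,\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}.$$

The distribution is parameterised by $\mu$ and $\sigma$, which we will collectively denote $\theta$. Put
another way, the manifold coordinates are given by $\theta^a = (\mu, \sigma)$ and the random variable is
$x \in \mathbb{R}$. Note that the parametric domain is $\mathbb{R} \times \mathbb{R}^+$. In order to compute $g_{ab}$ we must
compute $\partial_a \ln P$:

$$\partial_\mu \ln P = \frac{x-\mu}{\sigma^2}, \qquad \partial_\sigma \ln P = \frac{(x-\mu)^2}{\sigma^3} - \frac{1}{\sigma}.$$

Then, using Equation 2.1.1, the Fisher metric for the univariate normal distribution has
line element

$$ds^2 = \frac{1}{\sigma^2}\left(d\mu^2 + 2\, d\sigma^2\right).$$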

Thus we see that the Fisher metric, in this case, describes the metric tensor of a two-dimensional
hyperbolic geometry. The structure on this geometry can be intuitively understood
by the properties of normal distributions. In particular, for distributions with $\sigma \gg 1$,
the associated ‘difference’ between two distributions with means $\mu_1$ and $\mu_2$ is less pronounced:
they are harder to distinguish. For two sharply peaked distributions ($\sigma \ll 1$) with even
similar $\mu$, the difference will be very pronounced and so they are easy to distinguish. Hence
the hyperbolic nature of the space.
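As a numerical sanity check of this example, the following Python sketch approximates the integral in Equation 2.1.1 for the univariate normal family and compares it with the closed form $g = \sigma^{-2}\,\mathrm{diag}(1, 2)$. The helper names `log_normal` and `fisher_metric`, the midpoint quadrature, the finite-difference step and the truncation of the integration range to twelve standard deviations are assumptions made purely for illustration; they are not part of the original computation.

```python
import math

def log_normal(x, mu, sigma):
    """Log-density of the univariate normal distribution."""
    return (-0.5 * ((x - mu) / sigma) ** 2
            - math.log(sigma) - 0.5 * math.log(2.0 * math.pi))

def fisher_metric(log_p, theta, lo, hi, n=4000, eps=1e-5):
    """Approximate g_ab = int P (d_a ln P)(d_b ln P) dx with the midpoint rule.

    The scores d_a ln P are taken by central finite differences in theta.
    """
    m = len(theta)
    g = [[0.0] * m for _ in range(m)]
    dx = (hi - lo) / n
    for k in range(n):
        x = lo + (k + 0.5) * dx
        p = math.exp(log_p(x, *theta))
        score = []
        for a in range(m):
            up = list(theta); up[a] += eps
            dn = list(theta); dn[a] -= eps
            score.append((log_p(x, *up) - log_p(x, *dn)) / (2.0 * eps))
        for a in range(m):
            for b in range(m):
                g[a][b] += p * score[a] * score[b] * dx
    return g

mu, sigma = 0.3, 1.7  # an arbitrary test point in the parametric domain
g = fisher_metric(log_normal, [mu, sigma], mu - 12 * sigma, mu + 12 * sigma)
expected = [[1.0 / sigma ** 2, 0.0], [0.0, 2.0 / sigma ** 2]]
print(g)
print(expected)
```

The two printed arrays should agree to within quadrature and finite-difference error; changing `mu` leaves the result unchanged, while changing `sigma` rescales it by $1/\sigma^2$, which is the hyperbolic behaviour described above.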
2.2.2. Cauchy Distribution

The family of probability density functions for this distribution is given by

$$P(x; x_0, \gamma) = \frac{1}{\pi}\, \frac{\gamma}{\gamma^2 + (x - x_0)^2}.$$

Thus, the parameter space for this family is spanned by the parameters $\theta = (x_0, \gamma) \in \mathbb{R} \times \mathbb{R}^+$
and the calculation of the logarithmic derivatives gives

$$\ln P = \ln \gamma - \ln\left[\gamma^2 + (x - x_0)^2\right] - \ln \pi,$$

$$\partial_{x_0} \ln P = \frac{2(x - x_0)}{\gamma^2 + (x - x_0)^2}, \qquad \partial_\gamma \ln P = \frac{1}{\gamma} - \frac{2\gamma}{\gamma^2 + (x - x_0)^2}.$$

As such, it is a simple matter to verify that the Fisher metric for the Cauchy distribution is
given by

$$g_{ab} = \frac{1}{2\gamma^2}\, \delta_{ab}, \qquad ds^2 = \frac{1}{2\gamma^2}\left(dx_0^2 + d\gamma^2\right).$$

The reader may wish to note that while we started with a very different distribution, the
geometric structure described by its Fisher metric is very close to that of the normal distribution.
In this sense, hyperbolic spaces (or Euclidean anti-de Sitter spaces) appear ubiquitous
in an information geometric context.
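The Cauchy result can be checked numerically in the same spirit. The sketch below assumes the substitution $x = x_0 + \gamma \tan t$, under which $P\,dx = dt/\pi$; this removes the heavy tails from the quadrature. The function names and the choice of test point are hypothetical, and the expected value $g_{ab} = \delta_{ab}/(2\gamma^2)$ is the standard Fisher information of the Cauchy location-scale family.

```python
import math

def cauchy_scores(x, x0, gamma):
    """Logarithmic derivatives of the Cauchy density with respect to (x0, gamma)."""
    denom = gamma ** 2 + (x - x0) ** 2
    return (2.0 * (x - x0) / denom, 1.0 / gamma - 2.0 * gamma / denom)

def cauchy_fisher_metric(x0, gamma, n=2000):
    """Midpoint rule in t, where x = x0 + gamma * tan(t) so that P dx = dt / pi."""
    g = [[0.0, 0.0], [0.0, 0.0]]
    dt = math.pi / n
    for k in range(n):
        t = -0.5 * math.pi + (k + 0.5) * dt
        s = cauchy_scores(x0 + gamma * math.tan(t), x0, gamma)
        for a in range(2):
            for b in range(2):
                g[a][b] += s[a] * s[b] * dt / math.pi
    return g

gamma = 0.8  # arbitrary test value of the scale parameter
g_cauchy = cauchy_fisher_metric(x0=-1.0, gamma=gamma)
print(g_cauchy, 1.0 / (2.0 * gamma ** 2))
```

Both diagonal entries should match $1/(2\gamma^2)$ and the off-diagonal entries should vanish, so the metric is conformal to the upper half-plane metric, as for the normal distribution.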



3. Reversing the Fisher information metric 

It is not clear at first glance that it is at all possible to reverse the process of computing the
Fisher metric in any meaningful way, as the exercise involves a definite integral of multiple
powers of the underlying family of probability density functions. We present below a motivating
example to suggest that under certain, constrained situations such a process is indeed
possible. As a prototype for a more general construction, we demonstrate how to encode
the metric tensor of $S^n$, for any $n \in \mathbb{N}$, in a family of one dimensional probability density
functions.

3.1. The n-dimensional sphere, $S^n$

We begin our exploration of reversing the Fisher information computation with a one-dimensional
family of probability density functions. In particular, we leverage the properties
of orthonormal functions to produce a family of probability density functions which, with
an appropriate set of functions $h^i$, give rise to the metric tensor of $S^n$.

Note that, for our purposes, a family of univariate, real-valued functions $\{f_i(x)\}_{i=1}^n$ is
said to be orthonormal with weight $w(x)$ over a domain $X$ if $\int_X f_i(x) f_j(x) w(x)\, dx = \delta_{ij}$.

Proposition 3.1. Let $M \subseteq \mathbb{R}^m$ and $h^i \in C^1(M)$ such that$^1$ $(\forall \theta \in M)\ h^i h^j \delta_{ij} = 4$ and
$\{f_i(x)\}_i$ be a set of orthonormal, real-valued functions with positive semidefinite weight $w(x)$
over $X \subseteq \mathbb{R}$. Then the family of probability density functions

$$P(x;\theta) = \frac{1}{4}\left(\sum_{i=1}^n h^i(\theta) f_i(x)\right)^2 w(x), \tag{3.1.1}$$

gives the Fisher information metric tensor $g_{ab} = (\partial_a h^i)(\partial_b h^j)\delta_{ij}$.


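Proof. That $P$ is normalised follows trivially from the orthonormality of $f_i$:

$$\begin{aligned}
\int_X \frac{1}{4}\left(\sum_{i=1}^n h^i(\theta) f_i(x)\right)^2 w(x)\, dx
&= \frac{1}{4}\int_X \left(\sum_{i=1}^n \sum_{j=1}^n h^i h^j f_i f_j\right) w\, dx \\
&= \frac{1}{4}\sum_{i=1}^n \sum_{j=1}^n \int_X h^i h^j f_i f_j w\, dx = \frac{1}{4} h^i h^j \delta_{ij} \\
&= 1.
\end{aligned}$$

A straightforward computation gives the desired result.

$$\begin{aligned}
g_{ab} &= \int_X P\, (\partial_a \ln P)(\partial_b \ln P)\, dx \\
&= \int_X \frac{1}{4}\left(\sum_{i=1}^n h^i f_i\right)^2 w \left(\frac{2\sum_{i=1}^n (\partial_a h^i) f_i}{\sum_{i=1}^n h^i f_i}\right)\left(\frac{2\sum_{j=1}^n (\partial_b h^j) f_j}{\sum_{j=1}^n h^j f_j}\right) dx \\
&= \sum_{i=1}^n \sum_{j=1}^n \int_X (\partial_a h^i)(\partial_b h^j) f_i f_j w\, dx = (\partial_a h^i)(\partial_b h^j)\delta_{ij}.
\end{aligned}$$

□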


Now we pause to note that we may view the above statement, $g_{ab} = (\partial_a h^i)(\partial_b h^j)\delta_{ij}$, as
the result of applying the transition functions $h$ to the flat Euclidean metric $\delta$. As such, and
noting that we required $h^i h^j \delta_{ij} = 4$, we immediately infer that


Corollary 3.2. The metric tensor of $S^n$ can be reached as the Fisher information metric
of the distribution in Equation 3.1.1 where $h$ is the transition function from $\mathbb{E}^n$ to $S^n$, the
n-dimensional sphere of radius two (so that $h^i h^j \delta_{ij} = 4$).
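A small numerical illustration of Proposition 3.1 may help fix ideas. The sketch below assumes a particular, hypothetical choice that is not in the text: two functions $f_1 = \sqrt{2}\sin(\pi x)$ and $f_2 = \sqrt{2}\sin(2\pi x)$, orthonormal on $X = [0, 1]$ with weight $w = 1$, and a single parameter $\theta$ with $h = (2\cos\theta, 2\sin\theta)$, which satisfies the constraint $h^i h^j \delta_{ij} = 4$. The proposition then predicts $g_{\theta\theta} = (\partial_\theta h^i)(\partial_\theta h^j)\delta_{ij} = 4$.

```python
import math

def basis(x):
    """Two functions orthonormal on [0, 1] with weight w(x) = 1 (illustrative choice)."""
    return (math.sqrt(2.0) * math.sin(math.pi * x),
            math.sqrt(2.0) * math.sin(2.0 * math.pi * x))

def h(theta):
    """Coefficients obeying the constraint h^i h^j delta_ij = 4."""
    return (2.0 * math.cos(theta), 2.0 * math.sin(theta))

def dh(theta):
    """Derivative of h with respect to theta."""
    return (-2.0 * math.sin(theta), 2.0 * math.cos(theta))

def check_proposition(theta, n=2000):
    """Return (normalisation of P, g_theta_theta) via the midpoint rule on [0, 1]."""
    norm, g = 0.0, 0.0
    dx = 1.0 / n
    for k in range(n):
        f = basis((k + 0.5) * dx)
        u = sum(hi * fi for hi, fi in zip(h(theta), f))   # sum_i h^i f_i
        v = sum(di * fi for di, fi in zip(dh(theta), f))  # sum_i (d_theta h^i) f_i
        p = 0.25 * u * u                                  # Equation 3.1.1 with w = 1
        norm += p * dx
        if p > 0.0:                                       # skip exact zeros of P
            score = 2.0 * v / u                           # d_theta ln P
            g += p * score * score * dx
    return norm, g

theta0 = 0.7
norm, g_tt = check_proposition(theta0)
predicted = sum(d * d for d in dh(theta0))
print(norm, g_tt, predicted)
```

The normalisation should come out as one and the numerically integrated metric should match the predicted value, independently of the value chosen for `theta0`.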


In the above we have shown a general way to find a given metric tensor in terms of the
transition functions from flat Euclidean space to a desired geometry. However, there is a
specific condition on the $h^i$ given by $h^i h_i = 4$ which constrains these strongly. In what
follows, we will generalise this result in a way which will remove this constraint.

$^1$ Here we use Einstein summation and the lowered and raised indices have no differential geometric
interpretation other than to aid in the appropriate summations.

3.2. The Gaussian construction


Now that we have reason to believe that it is possible, at least in special cases, to pick a metric 
tensor and construct a family of probability density functions whose Fisher information 
metric is the selected metric, we attempt to extend our results to arbitrary Riemannian 
metrics. 


Consider a family of probability density functions given by a product of $n$ uncorrelated,
disjoint, one-dimensional Gaussian probability density functions with unit variance.
Explicitly,

$$P(x;\theta) = \prod_{i=1}^n \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(x^i - h^i(\theta)\right)^2}, \tag{3.2.1}$$


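where $M$, the parametric domain, is not yet fixed, $X = \mathbb{R}^n$, and $h^i \in C^1(M)$. From this, we
may compute the Fisher information metric as follows

$$\begin{aligned}
g_{ab} &= \frac{1}{(2\pi)^{n/2}} \int_X dx\; e^{-\frac{1}{2}\sum_{i=1}^n (x^i - h^i)^2}\, \partial_a\!\left[-\frac{1}{2}\sum_{j=1}^n (x^j - h^j)^2\right] \partial_b\!\left[-\frac{1}{2}\sum_{k=1}^n (x^k - h^k)^2\right] \\
&= \frac{1}{(2\pi)^{n/2}} \int_X dx\; e^{-\frac{1}{2}\sum_{i=1}^n (x^i - h^i)^2} \left(\sum_{j=1}^n (x^j - h^j)\, \partial_a h^j\right)\left(\sum_{k=1}^n (x^k - h^k)\, \partial_b h^k\right) \\
&= \frac{1}{(2\pi)^{n/2}} \sum_{j=1}^n (\partial_a h^j)(\partial_b h^j) \int_X dx\; (x^j - h^j)^2\, e^{-\frac{1}{2}\sum_{i=1}^n (x^i - h^i)^2} + \left(\text{vanishing cross-terms}\right).
\end{aligned}$$

The cross-terms with $j \neq k$ vanish because each factor $(x^j - h^j)$ integrates to zero against
its own Gaussian, and each remaining Gaussian second moment equals one. It is therefore a
simple matter to complete the computation to obtain

$$g_{ab} = (\partial_a h^j)(\partial_b h^k)\delta_{jk}. \tag{3.2.2}$$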


This result allows us enough flexibility to be able to always give an $h$ and $M$ such
that $g_{ab}$ may be constructed as desired. In particular, we may begin at Equation 3.2.2 and
read backwards to find Equation 3.2.1. In doing so, we fix a desired $g_{ab}$ and accompanying
manifold $\mathcal{M}$, and attempt to realise an $h$ and $M$ for which Equation 3.2.2 would hold. Unlike
the case of Proposition 3.1, which came with the constraint $h^i h_i = 4$, this process is here
always possible.

The Nash Embedding Theorem [11] tells us that there is an $n \in \mathbb{N}$ such that $(\mathcal{M}, g)$ may
be isometrically embedded in $(\mathbb{E}^n, \delta)$. Specifically then, it tells us that there exists an
$h$ such that $g = h^*\delta$. As such, interpreting Equation 3.2.2 as the statement that $g$ is the
pullback of $\delta$ via $h$ we see that we need only select an $n$ large enough to accommodate the
Nash embedding of the desired manifold $\mathcal{M}$ in $\mathbb{E}^n$ (which is always possible) and we have
$h$ and $M$ to satisfy the arrangement. Consequently, we have a family of probability density
functions, given by Equation 3.2.1, whose Fisher information metric is the desired, arbitrary
Riemannian metric.

Said another way, Equation 3.2.2 states simply that $g_{ab}$ is the pullback from a higher
dimensional flat space to a manifold embedded in that space, via $h$. In the case of coincidence
of dimensions between $g$ and $h$, the result bears the simple interpretation of $h$ acting as a
set of transition functions from $\delta$ to $g$.


3.2.1. The metric of $S^2$

To cement the understanding of the importance and generality of Equation 3.2.2 we construct
the metric tensor of $S^2$. Suppose we desire a family of probability density functions whose
Fisher information metric is the metric tensor of $S^2$. Specifically, if the unit sphere has line
element

$$ds^2 = d\phi^2 + \sin^2\phi\, d\theta^2,$$

then we can proceed as outlined above, and write down a set of transition functions

$$h = (\cos\theta \sin\phi,\ \sin\theta \sin\phi,\ \cos\phi),$$

from $\mathbb{E}^3$ to the embedded $S^2$. Applying the construction of Equation 3.2.1 we find

$$P(x, y, z; \theta, \phi) = \frac{1}{(2\pi)^{3/2}}\, e^{-\frac{1}{2}\left[(x - \cos\theta\sin\phi)^2 + (y - \sin\theta\sin\phi)^2 + (z - \cos\phi)^2\right]}.$$

This is easily recognisable as a product of three Gaussian probability density functions,
each with a mean which is periodic in the parameters. This means that we have the geometry
and topology of a sphere, where each point on the sphere corresponds to a three dimensional
Gaussian distribution with unit variance and mean denoted by the point on the sphere. This
exercise can be performed for any $S^n$ by simply forming the appropriate $h$.


The ease with which we are able to perform this construction is indicative of the power
underlying Equation 3.2.2 and the accompanying statement that any Riemannian metric
tensor may be reached via this construction.
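The sphere example can be verified by brute force. The sketch below assumes the transition functions $h = (\cos\theta\sin\phi, \sin\theta\sin\phi, \cos\phi)$ and a product of three unit-variance Gaussians with mean $h$, and integrates the Fisher metric on a coarse three-dimensional midpoint grid. The grid size, the box half-width and all function names are illustrative assumptions. In the coordinate ordering $(\theta, \phi)$ the expected result is $g = \mathrm{diag}(\sin^2\phi, 1)$, the round metric on the unit sphere.

```python
import math

def h(theta, phi):
    """Embedding of the unit sphere into three-dimensional Euclidean space."""
    return (math.cos(theta) * math.sin(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(phi))

def dh(theta, phi):
    """Analytic partial derivatives of h with respect to theta and phi."""
    d_theta = (-math.sin(theta) * math.sin(phi), math.cos(theta) * math.sin(phi), 0.0)
    d_phi = (math.cos(theta) * math.cos(phi), math.sin(theta) * math.cos(phi), -math.sin(phi))
    return (d_theta, d_phi)

def sphere_fisher_metric(theta, phi, half_width=8.0, n=40):
    """Fisher metric of a unit-variance 3D Gaussian with mean h(theta, phi)."""
    mean, grads = h(theta, phi), dh(theta, phi)
    step = 2.0 * half_width / n
    nodes = [-half_width + (k + 0.5) * step for k in range(n)]
    norm = (2.0 * math.pi) ** -1.5
    g = [[0.0, 0.0], [0.0, 0.0]]
    for x in nodes:
        for y in nodes:
            for z in nodes:
                r = (x - mean[0], y - mean[1], z - mean[2])
                p = norm * math.exp(-0.5 * (r[0] ** 2 + r[1] ** 2 + r[2] ** 2))
                # d_a ln P = sum_i (x^i - h^i) d_a h^i
                score = [sum(r[i] * grads[a][i] for i in range(3)) for a in range(2)]
                for a in range(2):
                    for b in range(2):
                        g[a][b] += p * score[a] * score[b] * step ** 3
    return g

theta0, phi0 = 0.4, 1.1
g_sphere = sphere_fisher_metric(theta0, phi0)
print(g_sphere, math.sin(phi0) ** 2)
```

Because the integrand is a Gaussian times a polynomial, even this coarse grid reproduces the pullback metric to high accuracy; a finer grid is not needed for the comparison.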


3.3. The hyperbolic secant construction 

In the previous subsection we gave a construction based upon a product of Gaussian probability
density functions and demonstrated its flexibility. Now we demonstrate that the
above-mentioned results are just as achievable with an entirely different family of probability
density functions. Consider the family

$$P(x;\theta) = \prod_{i=1}^n \frac{1}{\pi}\, \operatorname{sech}\!\left(x^i - \sqrt{2}\, h^i(\theta)\right).$$

Other than the functional dependence on $\sim x^i - h^i$, this is entirely different from the
Gaussians discussed earlier. However, computing the Fisher information metric we find the
result to be of that most general form

$$g_{ab} = (\partial_a h^i)(\partial_b h^j)\delta_{ij}.$$

Naturally, this bears the same interpretation as the previous result and serves to suggest that
relatively little of the information about the original family of probability density functions
is carried through to the metric tensor itself.

The careful reader will note that we now have two means to the same end, and may
wonder just how many more ways we may achieve the above result. Indeed the following
section serves to introduce a general framework which will show that the answer is that
there is an infinite-fold degeneracy in the construction, and thus there is always an infinite-to-one
mapping between families of PDFs and Riemannian metrics via the Fisher information
metric.


4. General results 

In this section we will elaborate on a more general set of statements which allow for definitions
independent of dimensionality and functional dependence of the parameters of the PDF in
question. We begin by showing how to construct a family of spatially disjoint probability
density functions out of individual families of probability density functions.

Definition 4.1. The spatially disjoint product of two families of probability density functions
on the same parametric domain, $P_1 = P_1(x^1, \ldots, x^n; \theta) : X_1 \times M \to \mathbb{R}$ and
$P_2 = P_2(x^1, \ldots, x^k; \theta) : X_2 \times M \to \mathbb{R}$, is defined as

$$(P_1 \odot P_2)(x^1, \ldots, x^{n+k}; \theta) = P_1(x^1, \ldots, x^n; \theta) \cdot P_2(x^{n+1}, \ldots, x^{n+k}; \theta).$$

Note that $P_1 \odot P_2 : (X_1 \times X_2) \times M \to \mathbb{R}$ and we write $P^{\odot n}$ where we mean $\bigodot_{i=1}^n P$.

Given this, we will here show how a special property of spatially disjoint products underpins
all the general results achieved in this work. That is, the Fisher information metric
transforms the spatially disjoint product of probability density functions into a sum of their
corresponding, individually considered metric tensors.

Theorem 4.2. If $P = P(x;\theta)$ is a probability density function with a decomposition
$P = \bigodot_i P_i^{\odot e_i}$ for some $P_i$ and $e_i \in \mathbb{N}^+$ then $g_{ab}\left(\bigodot_i P_i^{\odot e_i}\right) = \sum_i e_i\, g_{ab}(P_i)$.

Proof. Let us rewrite $P = \bigodot_i P_i^{\odot e_i} = \bigodot_j P_j$ where each $P_i$ has been accumulated into the
spatially disjoint product $e_i$ times, that is, $P_j = P_i$ for $e_i$ many $j$. Then, in order to compute
$g(P)$ we expand logarithmic derivatives to arrive at

$$g_{ab}(P) = \sum_i \sum_j \int_X dx\, \frac{P}{P_i P_j}\, (\partial_a P_i)(\partial_b P_j).$$

To proceed we must evaluate the double sum, and to do so we examine the cases $j = i$ and
$j \neq i$ separately. In the event of the latter, $j \neq i$, we have

$$\int_X dx\, \frac{P}{P_i P_j}\, (\partial_a P_i)(\partial_b P_j) = \left(\int_{X_i} dx^\alpha \cdots dx^\kappa\, \partial_a P_i\right)\left(\int_{X_j} dx^\lambda \cdots dx^\rho\, \partial_b P_j\right),$$

where we have expanded the integral as a product over its disjoint spatial domains and have
suppressed all other terms as they were of the form $\int_{X_l} dx^\alpha \cdots dx^\kappa\, P_l = 1$. Moreover, we note
that $P_i$ satisfies the conditions (by the definition of the probability density function) for the
exchange of integral and derivative and so

$$\int_{X_i} dx^\alpha \cdots dx^\kappa\, \partial_a P_i = \partial_a \int_{X_i} dx^\alpha \cdots dx^\kappa\, P_i = \partial_a(1) = 0.$$

Thus the contribution from terms where $j \neq i$ is zero. On the other hand, the cases for which
$i = j$ admit simple resolution as

$$\int_X dx\, \frac{P}{P_i^2}\, (\partial_a P_i)(\partial_b P_i) = \int_{X_i} dx^\alpha \cdots dx^\kappa\, \frac{(\partial_a P_i)(\partial_b P_i)}{P_i} = g_{ab}(P_i),$$

where again we have expanded the integral as a product and suppressed all terms whose
integral was one. Finally, we recall that we had exactly $e_i$ many $P_j$ such that $P_j = P_i$ and
so we collect $e_i$ many such contributions of $g_{ab}(P_i)$. □
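The additivity stated in Theorem 4.2 can be illustrated numerically. The sketch below assumes two hypothetical one-dimensional Gaussian families on a shared two-parameter domain $\theta = (a, b)$, one with mean $a + b$ and one with mean $ab$, and compares the Fisher metric of their spatially disjoint product (integrated on a two-dimensional grid) with the sum of the individual metrics. The families, the grid and the function names are all chosen for illustration only, and each multiplicity $e_i$ is taken to be one.

```python
import itertools
import math

def log_gauss(u):
    """Log of the standard normal density."""
    return -0.5 * u * u - 0.5 * math.log(2.0 * math.pi)

def log_p1(point, theta):
    """One-dimensional family with mean a + b."""
    a, b = theta
    return log_gauss(point[0] - (a + b))

def log_p2(point, theta):
    """One-dimensional family with mean a * b."""
    a, b = theta
    return log_gauss(point[0] - a * b)

def log_product(point, theta):
    """Spatially disjoint product: P1 acts on the first coordinate, P2 on the second."""
    return log_p1(point[:1], theta) + log_p2(point[1:], theta)

def fisher_metric(log_p, dim, theta, half_width=9.0, n=60, eps=1e-5):
    """Midpoint-rule Fisher metric on a dim-dimensional box, finite-difference scores."""
    m = len(theta)
    step = 2.0 * half_width / n
    nodes = [-half_width + (k + 0.5) * step for k in range(n)]
    g = [[0.0] * m for _ in range(m)]
    for point in itertools.product(nodes, repeat=dim):
        p = math.exp(log_p(point, theta))
        score = []
        for a in range(m):
            up = list(theta); up[a] += eps
            dn = list(theta); dn[a] -= eps
            score.append((log_p(point, up) - log_p(point, dn)) / (2.0 * eps))
        for a in range(m):
            for b in range(m):
                g[a][b] += p * score[a] * score[b] * step ** dim
    return g

theta = [0.5, -0.7]
g1 = fisher_metric(log_p1, 1, theta)
g2 = fisher_metric(log_p2, 1, theta)
g12 = fisher_metric(log_product, 2, theta)
print(g12)
print([[g1[a][b] + g2[a][b] for b in range(2)] for a in range(2)])
```

In this example each of `g1` and `g2` is a rank-one array, so neither is a metric tensor on its own, while their sum is non-degenerate. This is the situation that Remark 4.3 cautions about.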


Remark 4.3. That we essentially require $M_1 = M_2 = M$ in the definition of the spatially
disjoint product is a matter of some subtlety. Consider that if $M_1 \neq M_2$ we would be within
reason to set $M = M_1 \times M_2$ and reinterpret the definition as

$$(P_1 \odot P_2)(x^1, \ldots, x^{n+k}; \theta, \phi) = P_1(x^1, \ldots, x^n; \theta) \cdot P_2(x^{n+1}, \ldots, x^{n+k}; \phi).$$

In this case, however, $g(P)$ is not strictly the sum of $g(P_i)$ as the latter may all be of different
dimension. Simply re-interpreting $P_i$ to have enlarged parametric domain $M$ will not solve
this problem as then it may happen that $g(P_i)$ will no longer be non-degenerate and so not a
metric tensor. Thus, the direct ability of the above result to “glue” together disjoint metric
tensors is apparent, but nuanced and not an immediate consequence of the exposition given.

In effect then, care should be taken when examining the statement $g\left(\bigodot P_i\right) = \sum g(P_i)$ so
as to ensure that it is done with the understanding that $g(P_i)$ is to have zero entries where
appropriate for the purpose of the sum, but not when considered as its own metric tensor.
More formally, we could write $g\left(\bigodot P_i\right) = \sum \tilde{g}(P_i)$ where $\tilde{g}$ is expressed precisely as $g$, but is
extended to all of $M$ as suggested above, and is free from interpretation as a metric tensor.
Hereafter, it is taken for granted that such nuances are appreciated by the reader. ◊


The importance of Theorem 4.2 cannot be overstated. From here on, it is simply a
matter of finding convenient forms of $g_{ab}(P_i)$ for some parameterisation of $P_i$ so that we
may take $\bigodot P_i$ and arrive at a desired metric tensor. That is, if we can find a $P_i$ such that
$g_{ab}(P_i) \propto (\partial_a h^i)(\partial_b h^i)$ (no sum over $i$) then we can take $P = \bigodot P_i$ to find $g_{ab} \propto (\partial_a h^i)(\partial_b h^j)\delta_{ij}$ by the above.
Here, the whole is more than the sum of its parts: given $g_{ab} \propto (\partial_a h^i)(\partial_b h^j)\delta_{ij}$ we are able
to find an $h$ for our desired manifold and then create a desired $P$ out of constituent $P_i$,
each containing some part of $\{h^i\}$. Beginning with disjoint $P_i$, however, the qualities which
the individual distributions should exhibit, to attain a given $g$, are not clear. Furthermore,
we note here that while $\bigodot P_i$ will yield the desired result, if we find multiple families of
probability density functions, we may equally well combine them to achieve the same result.

Thus, what we really seek are simple forms of functional dependence of families of probability
density functions upon our set of differentiable functions $h$ so that explicit computations
may be made. Recall that we saw, in the calculations in subsections 3.2 and 3.3, that
we may leverage reparameterisation invariance of spatial domains to our advantage. Such
symmetries of the spatial domain allow us to essentially eliminate any functional dependence
of the integrals upon the $h^i$ and produce multiplicative factors of $\partial_a h$ in the process. To that
end, we explore a generalisation of the symmetry used in the above-mentioned subsections.

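Proposition 4.4. Fix a one-dimensional probability density function $\tilde{P}(x)$ on $X$ for which
$X$ remains invariant under the change of variables $y = f(x;\theta)$, for some differentiable
family of diffeomorphisms $f : X \times M \to X$ (the parameter space is $M$) and let $P(x;\theta) =
f_x(x;\theta)\, \tilde{P}(f(x;\theta))$ such that $\partial_a P \neq 0$ where we write $f_x$ for $\partial f / \partial x$ and $f_a$ for $\partial_a f$. Then

$$g_{ab} = \int_X \left[\left(\frac{\partial (f_a f_b)}{\partial y} + f_a f_b\, \frac{d \ln \tilde{P}(y)}{dy}\right)\frac{d \tilde{P}(y)}{dy} + \frac{f_{ax} f_{bx}}{(f_x)^2}\, \tilde{P}(y)\right] dy, \tag{4.4}$$

where we assume that we have written all functions in terms of $y = f(x;\theta)$ using the expression
$x = f^{-1}(y;\theta)$ where necessary.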


Proof. We first check that $P(x;\theta) = f_x(x;\theta)\, \tilde{P}(f(x;\theta))$ is normalised. To that end, let
$y = f(x;\theta)$, so that

$$\int_X P\, dx = \int_X f_x\, \tilde{P}(f)\, dx = \int_X f_x\, \tilde{P}(y)\, \frac{dy}{f_x} = 1.$$

Then we compute the logarithmic derivatives necessary for the Fisher information metric

$$\partial_a \ln P = \frac{1}{f_x \tilde{P}(f)}\left(\frac{d \tilde{P}(f)}{df}\, (f_a f_x) + \tilde{P}(f)\, (f_{ax})\right).$$

We proceed with the computation by making the change of variables $y = f(x;\theta)$

$$\begin{aligned}
g_{ab} &= \int_X \frac{1}{(f_x)^2\, \tilde{P}(y)}\left(\frac{d \tilde{P}(y)}{dy}\, (f_a f_x) + \tilde{P}(y)\, (f_{ax})\right)\left(\frac{d \tilde{P}(y)}{dy}\, (f_b f_x) + \tilde{P}(y)\, (f_{bx})\right) dy \\
&= \int_X \left[f_a f_b\, \frac{d \tilde{P}(y)}{dy}\frac{d \ln \tilde{P}(y)}{dy} + \frac{f_{ax} f_{bx}}{(f_x)^2}\, \tilde{P}(y) + \frac{f_a f_{bx} + f_b f_{ax}}{f_x}\, \frac{d \tilde{P}(y)}{dy}\right] dy.
\end{aligned}$$

Finally, we recognise that $\frac{1}{f_x} = \frac{dx}{dy}$ and that $f_a f_{bx} + f_b f_{ax} = \frac{\partial (f_a f_b)}{\partial x}$ and collect terms to
arrive at the result. □

Of course, examining symmetry at such an abstract level cannot be expected to yield
concrete answers immediately and so that the statement of Proposition 4.4 is opaque and
not obviously useful is not surprising. Indeed, in what follows we make various simplifying
assumptions about the functional form of the symmetry function $f$ to arrive at generalisations
of familiar results.

We begin by noticing that there is a term in Equation 4.4 which is proportional to $f_a f_b$.
If it could be arranged that $f_a f_b$ be independent of $y$, then we could simply extract a term
proportional to $f_a f_b$ from the result, a term whose importance we already know. Moreover,
if we could ensure that the other terms vanish, we would have $g_{ab} \propto f_a f_b$ and achieve our
general result once more.

To that end, we choose to require that $f_x$ be constant and $f_{ax} = 0$. Although this is
likely not the only way to achieve our desired effect, it will certainly suffice. In this case, we
see immediately that $f(x;\theta) = cx + h(\theta)$ is the general solution, but this is nothing other
than the statement of translation invariance. Thus, we may achieve the following results by
means of Proposition 4.4.


Proposition 4.5. Fix a one-dimensional probability density function $\tilde{P}$ such that the change
of variables $y = x - h$ for $h(\theta)$ a differentiable function on $M \subseteq \mathbb{R}^m$ leaves the spatial domain
$X$ unchanged. Let $P(x;\theta) = \tilde{P}(x - h)$ then $g_{ab} = (\partial_a h)(\partial_b h) D$ where

$$D = \int_X dx \left(\frac{d \tilde{P}(x)}{dx}\right)\left(\frac{d \ln \tilde{P}(x)}{dx}\right).$$

Proof. Apply Proposition 4.4 to $f(x;\theta) = x - h(\theta)$. □
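To give a feel for the constant $D$ in Proposition 4.5, the following sketch evaluates $D = \int_X P\,(d \ln P/dx)^2\,dx$, which equals the integral above, for three standard densities on $\mathbb{R}$: the unit Gaussian, $\frac{1}{\pi}\operatorname{sech} x$ and the unit Cauchy density. The dictionary of log-densities, the truncation of the real line to a finite interval and the finite-difference derivative are assumptions made for this illustration. The exact values are $1$, $\frac{1}{2}$ and $\frac{1}{2}$ respectively.

```python
import math

# Log-densities of three translation-invariant families on the real line.
LOG_DENSITIES = {
    "gaussian": lambda x: -0.5 * x * x - 0.5 * math.log(2.0 * math.pi),
    "sech": lambda x: -math.log(math.pi) - math.log(math.cosh(x)),
    "cauchy": lambda x: -math.log(math.pi) - math.log(1.0 + x * x),
}

def translation_constant(log_p, half_width=60.0, n=60000, eps=1e-5):
    """D = int P (d ln P / dx)^2 dx, midpoint rule on [-half_width, half_width]."""
    dx = 2.0 * half_width / n
    total = 0.0
    for k in range(n):
        x = -half_width + (k + 0.5) * dx
        slope = (log_p(x + eps) - log_p(x - eps)) / (2.0 * eps)
        total += math.exp(log_p(x)) * slope * slope * dx
    return total

D = {name: translation_constant(log_p) for name, log_p in LOG_DENSITIES.items()}
# Shifting by h / sqrt(D) instead of h normalises the coefficient of (d_a h)(d_b h) to one.
rescale = {name: 1.0 / math.sqrt(value) for name, value in D.items()}
print(D)
print(rescale)
```

Since $g_{ab} = (\partial_a h)(\partial_b h) D$, a density with $D = \frac{1}{2}$ reproduces the Gaussian result once $h$ is replaced by $\sqrt{2}\,h$.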


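Corollary 4.6. Fix one-dimensional probability density functions $P_i$ and let $h^i(\theta)$ be differentiable
on $M \subseteq \mathbb{R}^m$ and write $y^i = x^i - h^i$ such that $X_i$ is unchanged under this change of
variables for all $i$. Then $P(x;\theta) = \bigodot_i P_i(x^i - h^i)$ gives $g_{ab}(P) = (\partial_a h^i)(\partial_b h^j) D_{ij}$ where

$$D_{ij} = \begin{cases} \displaystyle\int_{X_i} dx^i \left(\frac{d P_i}{dx^i}\right)\left(\frac{d \ln P_i}{dx^i}\right), & i = j \\ 0, & i \neq j. \end{cases}$$

Proof. Combine Proposition 4.5 and Theorem 4.2. □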


Remark 4.7. When $P_i$ are all Gaussian, $D_{ij} = \delta_{ij}$ and so the result of Equation 3.2.2 follows
as a special case. ◊


To demonstrate how one might achieve the encoding of an arbitrary Riemannian metric 
tensor into a spatially disjoint product of one-dimensional families of probability density 
functions, consider the following example. 















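Example 4.8. Suppose we desire a hyperbolic metric tensor $g$ whose associated line element
is given by $\frac{1}{\beta^2}(d\alpha^2 + d\beta^2)$, on the open subset $M = \{(\alpha, \beta) \in \mathbb{R}^2 \mid \beta > 1\} \subset \mathbb{R}^2$. With some
work, it can be shown that an isometric embedding of $M$ into $\mathbb{E}^3$ can be achieved through
the function

$$h = \left(\frac{\cos\alpha}{\beta},\ \frac{\sin\alpha}{\beta},\ \ln\!\left(\beta + \sqrt{\beta^2 - 1}\right) - \frac{\sqrt{\beta^2 - 1}}{\beta}\right).$$

That is, $g = h^*\delta$. Moreover, it is evident that $h$ is at least $C^1$ so we may apply our
construction to it and write, for example,

$$P = P_1(x - h^1) \odot P_2(y - h^2) \odot P_3(z - h^3),$$

for any one-dimensional probability density functions $P_i$ which satisfy translation invariance
as outlined in Proposition 4.5. By Corollary 4.6 we then know that $g(P) = h^*D$ and so the
result follows in the case that $D = \delta$.

In particular then, we may choose to let $X_i = \mathbb{R}$ for $i \in \{1, 2, 3\}$ and put

$$P_1(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}, \qquad P_2(x) = \frac{1}{\pi}\operatorname{sech} x, \qquad P_3(x) = \frac{1}{\pi\,(1 + x^2)},$$

for which $D_1 = 1$ and $D_2 = D_3 = \frac{1}{2}$. Thus, taking the values of $D_i$ into account, we may
write $P(x, y, z; \alpha, \beta) = P_1(x - h^1) \odot P_2(y - \sqrt{2}\, h^2) \odot P_3(z - \sqrt{2}\, h^3)$ to recover

$$P(x, y, z; \alpha, \beta) = \frac{1}{\sqrt{2\pi}\, \pi^2}\; e^{-\frac{1}{2}\left(x - \frac{\cos\alpha}{\beta}\right)^2} \operatorname{sech}\!\left(y - \frac{\sqrt{2}\sin\alpha}{\beta}\right) \left[1 + \left(z - \sqrt{2}\ln\!\left(\beta + \sqrt{\beta^2 - 1}\right) + \frac{\sqrt{2}\sqrt{\beta^2 - 1}}{\beta}\right)^2\right]^{-1},$$

defined on $\mathbb{R}^3 \times M$, and for which we know, due to Corollary 4.6, the metric tensor is
$g = h^*\delta$. It may also be verified directly that, given

$$P_1(x; \alpha, \beta) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(x - \frac{\cos\alpha}{\beta}\right)^2}, \qquad P_2(x; \alpha, \beta) = \frac{1}{\pi}\operatorname{sech}\!\left(x - \frac{\sqrt{2}\sin\alpha}{\beta}\right),$$

$$P_3(x; \alpha, \beta) = \frac{1}{\pi}\left[1 + \left(x + \frac{\sqrt{2}\sqrt{\beta^2 - 1}}{\beta} - \sqrt{2}\ln\!\left(\beta + \sqrt{\beta^2 - 1}\right)\right)^2\right]^{-1},$$

we have

$$g(P_1) = \frac{1}{\beta^4}\begin{pmatrix} \beta^2 \sin^2\alpha & \beta \sin\alpha\cos\alpha \\ \beta \sin\alpha\cos\alpha & \cos^2\alpha \end{pmatrix}, \qquad g(P_2) = \frac{1}{\beta^4}\begin{pmatrix} \beta^2 \cos^2\alpha & -\beta \sin\alpha\cos\alpha \\ -\beta \sin\alpha\cos\alpha & \sin^2\alpha \end{pmatrix},$$

$$g(P_3) = \frac{1}{\beta^4}\begin{pmatrix} 0 & 0 \\ 0 & \beta^2 - 1 \end{pmatrix},$$

whose sum is as desired, that is, $g\left(\bigodot P_i\right) = \sum g(P_i)$ as Theorem 4.2 assured us. Thus, we
have managed to encode a desired metric tensor as the Fisher information metric of a spatially
disjoint product of three, one-dimensional families of probability density functions. △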
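The key geometric input of this example, namely that the embedding pulls the flat metric back to the hyperbolic one, is easy to test numerically. The sketch below assumes the embedding in the form $h = \left(\cos\alpha/\beta,\ \sin\alpha/\beta,\ \ln(\beta + \sqrt{\beta^2 - 1}) - \sqrt{\beta^2 - 1}/\beta\right)$ and computes $h^*\delta$ from a finite-difference Jacobian. The function names, the step size and the test point are illustrative assumptions.

```python
import math

def h(alpha, beta):
    """Assumed embedding of the region beta > 1 into three-dimensional Euclidean space."""
    root = math.sqrt(beta * beta - 1.0)
    return (math.cos(alpha) / beta,
            math.sin(alpha) / beta,
            math.log(beta + root) - root / beta)

def pullback_metric(embedding, point, eps=1e-6):
    """g_ab = sum_i (d_a h^i)(d_b h^i), with derivatives by central differences."""
    m = len(point)
    columns = []
    for a in range(m):
        up = list(point); up[a] += eps
        dn = list(point); dn[a] -= eps
        h_up, h_dn = embedding(*up), embedding(*dn)
        columns.append([(u - d) / (2.0 * eps) for u, d in zip(h_up, h_dn)])
    return [[sum(x * y for x, y in zip(columns[a], columns[b])) for b in range(m)]
            for a in range(m)]

alpha0, beta0 = 0.9, 2.5  # any point with beta > 1 will do
g_hyp = pullback_metric(h, [alpha0, beta0])
print(g_hyp, 1.0 / beta0 ** 2)
```

The result should be close to $\beta^{-2}$ times the identity, which is the matrix obtained by adding the three arrays $g(P_1)$, $g(P_2)$ and $g(P_3)$ of the example.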
We can explore another possible simplifying form of transformation $f$. Consider that
were $f(x;\theta) \propto x$, then every term in Equation 4.4 would contribute a factor proportional to
$f_a$. Again, this is a desirable result and so we explore the symmetry of scale invariance.

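Proposition 4.9. Fix a one-dimensional probability density function $\tilde{P}$ such that the change
of variables $y = x e^h$ for $h(\theta)$ a differentiable function on $M \subseteq \mathbb{R}^m$ leaves the spatial domain
$X$ unchanged. Let $P(x;\theta) = e^h \tilde{P}(x e^h)$ then $g_{ab} = (\partial_a h)(\partial_b h) E$ where

$$E = \int_X \tilde{P}(x)\left(1 + x\, \frac{d \ln \tilde{P}(x)}{dx}\right)^2 dx.$$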


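Proof. We set $f(x;\theta) = e^{h(\theta)} x$ and compute the required derivatives for Proposition 4.4 as
follows

$$f_a = (\partial_a h)\, x e^h, \qquad f_x = e^h, \qquad f_{ax} = (\partial_a h)\, e^h, \qquad \frac{\partial (f_a f_b)}{\partial y} = 2 (\partial_a h)(\partial_b h)\, y.$$

The result follows straightforwardly. □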
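The scale-invariant case can be checked in the same way as the translation-invariant one. The sketch below assumes the standard normal density as the fixed density and a hypothetical single-parameter function $h(\theta) = \sin\theta$; it computes $E = \int_X \tilde{P}(x)\left(1 + x\, d\ln\tilde{P}/dx\right)^2 dx$, which is $2$ for the Gaussian, and compares $(\partial_\theta h)^2 E$ with the Fisher information of the family $e^h \tilde{P}(x e^h)$ integrated directly. All names, the quadrature and the test point are illustrative.

```python
import math

def log_base(x):
    """Log of the standard normal density, used here as the fixed density."""
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)

def h(theta):
    """An arbitrary differentiable function of one parameter (illustrative)."""
    return math.sin(theta)

def log_family(x, theta):
    """ln P(x; theta) for the scaled family P = e^h * base(x * e^h)."""
    return h(theta) + log_base(x * math.exp(h(theta)))

def integrate(func, half_width=12.0, n=6000):
    """Midpoint rule on [-half_width, half_width]."""
    step = 2.0 * half_width / n
    return sum(func(-half_width + (k + 0.5) * step) for k in range(n)) * step

def scale_constant(eps=1e-5):
    """E = int P (1 + x d ln P / dx)^2 dx for the fixed density."""
    def integrand(x):
        slope = (log_base(x + eps) - log_base(x - eps)) / (2.0 * eps)
        return math.exp(log_base(x)) * (1.0 + x * slope) ** 2
    return integrate(integrand)

def fisher_scalar(theta, eps=1e-5):
    """Directly integrated Fisher information of the scaled family."""
    def integrand(x):
        score = (log_family(x, theta + eps) - log_family(x, theta - eps)) / (2.0 * eps)
        return math.exp(log_family(x, theta)) * score ** 2
    return integrate(integrand)

theta0 = 0.4
E = scale_constant()
g_direct = fisher_scalar(theta0)
g_predicted = math.cos(theta0) ** 2 * E  # (d_theta h)^2 * E
print(E, g_direct, g_predicted)
```

For the Gaussian, $E = 2$ agrees with the factor of two in the $d\sigma^2$ term of the normal-distribution metric of Section 2.2.1 once one identifies $\sigma = e^{-h}$.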


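Corollary 4.10. Fix one-dimensional probability density functions $P_i$ and let $h^i(\theta)$ be differentiable
on $M \subseteq \mathbb{R}^m$ and write $y^i = x^i e^{h^i}$ such that $X_i$ is unchanged under this change of
variables for all $i$. Then $P(x;\theta) = \bigodot_i \left(e^{h^i} P_i(x^i e^{h^i})\right)^{\odot e_i}$ gives $g_{ab}(P) = (\partial_a h^i)(\partial_b h^j) E_{ij}$ where

$$E_{ij} = \begin{cases} \displaystyle e_i \int_{X_i} dx^i\, P_i \left(1 + x^i\, \frac{d \ln P_i}{dx^i}\right)^2, & i = j \\ 0, & i \neq j. \end{cases}$$

Proof. Combine Proposition 4.9 and Theorem 4.2. □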


Corollary 4.11. Every Riemannian metric tensor may be reached as the result of the Fisher 
information metric acting upon a spatially disjoint product of families of one-dimensional 
probability density functions. 


Proof. Apply either Corollary 4.10 or Corollary 4.6 to the desired pullback $h$, which
exists due to the isometric embedding of the desired manifold in $\mathbb{E}^n$ via the Nash Embedding
theorem. □


It can now be seen that relatively simple computations give rise to highly useful results
by way of Theorem 4.2. Indeed, to extend this work one need only find other families of
probability density functions whose Fisher information metric can be made to be proportional
to $(\partial_a h)(\partial_b h)$ in order to combine them in the requisite multiplicity to allow $h$ to be the
pullback for a desired Riemannian metric tensor. That we made explicit use of spatial domain
symmetries using Proposition 4.4 should be seen as merely a convenient and intuitive way
of making use of Theorem 4.2 to construct desirable results.

5. Discussion


That we can associate a Riemannian information manifold with a well-defined metric to a
given family of probability distribution functions is a remarkable thing. Indeed, the power
of this statement immediately raises the question of how many statistical or information
theoretic properties can be captured in the language of differential geometry. It is clear that
the Fisher metric captures only a small amount of information about the family of PDFs;
however, the metric is but one differential geometric structure, and one could imagine that
more information may be translated into the language of form fields of different order.

What we have shown here is in line with the string theory ideas of holographic duality,
which indicate that any scale-free gauge theory should give rise to a hyperbolic geometry.
Different scale-free gauge theories should however give rise to different field contents, above
and beyond the metric, depending on the operators which can be formed in the gauge theory.
As discussed in the introduction, information geometry has already been used to go from:
gauge theory → PDF → metric. Thus it would be interesting, both from the information
theoretic point of view, as well as from the holographic point of view, to see what more
differential structure can be encoded in such mappings.

This article is our attempt to formulate a crisp statement about the uniqueness of the
association of a metric to a probability distribution. We saw how the Fisher information
metric took a spatially disjoint product of probability distributions to a sum of the individual
metric tensors. We leveraged this result to entirely reverse the computation, in generality.
In fact, we found that it is possible to explicitly construct any Riemannian metric via the
spatially disjoint product of one-dimensional probability density functions exhibiting a select
spatial domain symmetry. This symmetry in fact features in a crucial way in our construction
to inject dependence upon the components of the pullback used to isometrically embed the
desired metric in $\mathbb{E}^n$. Moreover, up to the spatial domain symmetries mentioned and some
mild conditions on the continuity of the probability density functions, we have shown that
such a construction may be given in terms of arbitrary probability density functions.

While our results appear to be quite negative in terms of the amount of information
encoded in the Fisher metric from a PDF, we propose to interpret it as a signal that, in order
to fully capture a duality that seems to point to a one-to-one map between string theory
on AdS$_5 \times S^5$ and maximally supersymmetric Yang-Mills theory on the AdS boundary, a
deeper understanding of information geometry is required. We leave this for future work.


6. Acknowledgements 

JS and TC are grateful for the URC National Research Foundation (NRF) of South Africa 
under grant number 87667. JM acknowledges support from the NRF Competitive Support 
for Rated Researcher program under grant CPRR 90519. 





References 


[1] S. Kullback, R. A. Leibler, “On Information and Sufficiency”. Ann. Math.
Statist. 22 (1951), no. 1, pp. 79-86. doi:10.1214/aoms/1177729694.
http://projecteuclid.org/euclid.aoms/1177729694.

[2] S.-I. Amari, “Natural gradient works efficiently in learning”. Neural Comput., vol. 10,
no. 2, pp. 251-276, Feb. 1998.

[3] S.-I. Amari, H. Nagaoka, “Methods of information geometry”. Translations of Mathematical
Monographs, v. 191, American Mathematical Society, 2000 (ISBN 978-0821805312).

[4] J. M. Maldacena, “The large N limit of superconformal field theories and supergravity”.
Adv. Theor. Math. Phys. 2, 231 (1998); Int. J. Theor. Phys. 38, 1113 (1999) [arXiv:hep-th/9711200].

[5] N. J. Hitchin, “The geometry and topology of moduli spaces”. Lecture Notes in Mathematics
Volume 1451, 1990, pp. 1-48.

[6] M. Blau, K. S. Narain and G. Thompson, “Instantons, the information metric, and the
AdS/CFT correspondence”, [arXiv:hep-th/0108122].

[7] M. Nozaki, S. Ryu, T. Takayanagi, “Holographic geometry of entanglement renormalization
in quantum field theories”. JHEP 10 (2012) 193, [arXiv:hep-th/1208.3469].

[8] H. Matsueda, “Embedding Quantum Information into Classical Spacetime: Information
Geometrical Perspectives on anti-de Sitter space / conformal field theory Correspondence”,
[arXiv:hep-th/1208.5103].

[9] S. J. Rey and Y. Hikida, “5d Black Hole as Emergent Geometry of Weakly Interacting
4d Hot Yang-Mills Gas”. JHEP 0608 (2006) 051, [arXiv:hep-th/0507082].

[10] J. Heckman, “Statistical Inference and String Theory”. [arXiv:hep-th/1305.3621].

[11] J. Nash, “The Imbedding Problem for Riemannian Manifolds”. Ann. Math. 63 (1956),
no. 1, pp. 20-63. http://www.jstor.org/stable/1969989.





